DGPathinter: a novel model for identifying driver genes via knowledge-driven matrix factorization with prior knowledge from interactome and pathways

Authors

  • Jianing Xi
  • Minghui Wang
  • Ao Li
Abstract

Distinguishing mutated driver genes, which confer a selective growth advantage on tumor cells, from sporadic passenger mutations is a critical problem in cancer genomics research. Previous studies have reported that some driver genes are not frequently mutated and cannot be detected as statistically significant, which complicates driver gene identification. To address this issue, some existing approaches incorporate prior knowledge from an interactome to detect driver genes that may be dysregulated through their interaction network context. However, although altered activity of many pathways has frequently been observed in cancer progression, prior knowledge from pathways has not been exploited in the driver gene identification task. In this paper, we introduce a driver gene prioritization method called driver gene identification through pathway and interactome information (DGPathinter), which is based on a knowledge-based matrix factorization model that incorporates prior knowledge from both the interactome and pathways. When DGPathinter is applied to somatic mutation datasets of three cancer types and evaluated against known driver genes, its prioritization performance is better than that of existing interactome-driven methods. The top-ranked genes detected by DGPathinter are also significantly enriched for known driver genes. Moreover, most of the top-scoring pathways given by DGPathinter are associated with cancer progression. These results suggest that DGPathinter is a useful tool for identifying potential driver genes.

Subjects: Bioinformatics, Computational Biology
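The abstract reports that the top-ranked genes are significantly enriched for known driver genes but does not say which statistical test was used. As a rough illustration only, the sketch below assumes a one-sided hypergeometric test, a common choice for this kind of enrichment question. The function name and the toy numbers are hypothetical and are not taken from the paper.

```python
from math import comb


def enrichment_pvalue(n_total, n_known, n_top, n_overlap):
    """One-sided hypergeometric p-value (assumed test, not confirmed by the source).

    n_total   : number of genes in the ranked list
    n_known   : number of known driver genes among them
    n_top     : size of the top-ranked set being examined
    n_overlap : known driver genes found in the top-ranked set

    Returns P(X >= n_overlap) when n_top genes are drawn at random
    without replacement from n_total genes.
    """
    max_overlap = min(n_known, n_top)
    denom = comb(n_total, n_top)
    # Sum the probability mass of every outcome at least as extreme as observed.
    tail = sum(
        comb(n_known, k) * comb(n_total - n_known, n_top - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / denom


# Toy example with made-up numbers: 20 genes, 5 known drivers,
# and all 5 of them appear among the top 5 ranked genes.
p = enrichment_pvalue(n_total=20, n_known=5, n_top=5, n_overlap=5)
```

A small p-value here would indicate that the overlap between the top-ranked set and the known drivers is unlikely to arise by chance under random ranking.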

Similar resources

Reconstruct modular phenotype-specific gene networks by knowledge-driven matrix factorization

MOTIVATION Reconstructing gene networks from microarray data has provided mechanistic information on cellular processes. A popular structure learning method, Bayesian network inference, has been used to determine network topology despite its shortcomings, i.e. the high-computational cost when analyzing a large number of genes and the inefficiency in exploiting prior knowledge, such as the co-re...

Using Knowledge Driven Matrix Factorization to Reconstruct Modular Gene Regulatory Network

Reconstructing gene networks from microarray data can provide information on the mechanisms that govern cellular processes. Numerous studies have been devoted to addressing this problem. A popular method is to view the gene network as a Bayesian inference network, and to apply structure learning methods to determine the topology of the gene network. There are, however, several shortcomings with...

Identifying and Ranking Development Drivers of Knowledge-based Technology-Driven Companies (Case study: Fars Province Science and Technology Park)

The purpose of this study is to identify and rank the development drivers of knowledge-based, technology-driven businesses. This work is conducted as a case study in Fars Province Science and Technology Park. It is a descriptive survey in terms of purpose since a part of its data is collected through questionnaires and is of surveying type because it describes the existing conditions. The...

Image Classification via Sparse Representation and Subspace Alignment

Image representation is a crucial problem in image processing, where there exist many low-level representations of images, e.g., SIFT, HOG and so on. But there is a missing link between low-level and high-level semantic representations. In fact, traditional machine learning approaches, e.g., non-negative matrix factorization, sparse representation and principal component analysis are employed to d...

Modeling Signal Transduction from Protein Phosphorylation to Gene Expression

BACKGROUND Signaling networks are of great importance for us to understand the cell's regulatory mechanism. The rise of large-scale genomic and proteomic data, and prior biological knowledge has paved the way for the reconstruction and discovery of novel signaling pathways in a data-driven manner. In this study, we investigate computational methods that integrate proteomics and transcriptomic d...

Publication date: 2017